You Are Not Expected to Understand This by Torie Bosch, Kelly Chudler, and Ellen Ullman

Author:Bosch, Torie;Chudler, Kelly;Ullman, Ellen;
Language: eng
Format: epub
Publisher: Princeton University Press
Published: 2022-08-25


17

Needles in the World’s Biggest Haystack

The Algorithm That Ranked the Internet

John MacCormick

We’ve all experienced, from time to time, a kind of compulsive web surfing in which we follow link after link, browsing content that becomes less and less relevant to the task at hand. This happened to me only yesterday: while working on some artificial intelligence research, I clicked on something interesting, followed a few links, and 20 minutes later found I was deep into an article about the human brain and consciousness. Strangely enough, this “random surfer” model of Internet browsing also lies at the heart of one of the most revolutionary pieces of code to impact the Internet age: Google’s PageRank algorithm.

It is widely believed that the PageRank algorithm, invented and first published by Google cofounders Sergey Brin and Larry Page in 1998, was the single most important element in launching the Google search engine to its dominance of the emerging web search industry in the early 2000s. Around this time, Google leapfrogged some established players such as Lycos and AltaVista, which have since faded into obscurity. How and why did this happen? The key insight of the Google cofounders was that a web search engine would live or die according to the quality of its ranking of search hits. The technology of crawling and indexing the entire web was already well understood—Lycos, AltaVista, and others had mastered that. The problem was that most search queries would overwhelm the user with far too many hits. For example, if I search the Web these days for “field hockey,” there are more than 300 million hits. This is only a tiny fraction of the entire Web, but still far too large to be a useful set of results. A good search engine, therefore, needs to rank those 300 million pages. Ideally, they would be ranked so that the top three to five search results are highly authoritative and informative about field hockey. With their 1998 PageRank algorithm, Brin and Page thought they had figured out a way to find the most authoritative and informative pages automatically—and the public voted with their mouse clicks. Google’s results were far more relevant than those of competitors such as Lycos and AltaVista. Google’s market share soared, and a twenty-first-century Internet giant was born.

Figure: Code that simulates Google’s PageRank algorithm.
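To make the "random surfer" idea concrete, here is a minimal sketch of such a simulation. It is not the listing from the book and not Google's production code. The four-page web `toy_web`, the function name `random_surfer_rank`, the step count, and the 0.85 probability of following a link (a commonly quoted PageRank setting) are all assumptions chosen for illustration. The surfer follows a random outgoing link most of the time and occasionally jumps to a random page; the fraction of time spent on each page serves as its score.

```python
import random
from collections import Counter


def random_surfer_rank(links, steps=100_000, follow_prob=0.85, seed=0):
    """Estimate page importance as the share of visits by a random surfer.

    links: dict mapping each page name to a list of pages it links to
           (every linked page is assumed to appear as a key).
    follow_prob: assumed chance of following a link rather than jumping
                 to a random page.
    """
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    pages = sorted(links)
    visits = Counter()
    page = rng.choice(pages)           # start on an arbitrary page

    for _ in range(steps):
        visits[page] += 1
        outgoing = links[page]
        if outgoing and rng.random() < follow_prob:
            page = rng.choice(outgoing)    # follow one of the page's links
        else:
            page = rng.choice(pages)       # lose interest, jump anywhere

    return {p: visits[p] / steps for p in pages}


# Hypothetical four-page web: "home" is linked from every other page,
# while nothing links to "blog".
toy_web = {
    "home": ["rules", "clubs"],
    "rules": ["home"],
    "clubs": ["home", "rules"],
    "blog": ["home"],
}

scores = random_surfer_rank(toy_web)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:6s} {score:.3f}")
```

In this toy web the page that everyone links to collects the most visits, and the page that nobody links to is reached only by random jumps, so it ends up at the bottom of the ranking.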


